Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protect migrated volume from node vm deletion #1762

Conversation

divyenpatel
Copy link
Member

@divyenpatel divyenpatel commented May 31, 2022

What this PR does / why we need it:
Migrated in-tree vSphere volumes does not have FCD control flag - keepAfterDeleteVm.
After volumes are migrated to CSI and registered as FCD, if Node VM is deleted while volumes are attached to them, volumes are deleted along with it.
This is a data loss.
This PR helps add keepAfterDeleteVm control flag to the migrated volumes.

Note: This is not happening for Volumes directly created by CSI driver, as when FCD is created, by default it gets keepAfterDeleteVm control flag.

Testing done:

Created in-tree vSphere volume

% kubectl get pvc 
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
vcp-pvc   Bound    pvc-71c27d29-2ab0-4850-9733-fda3b9236625   2Gi        RWO            vcp-sc         20h
divyenp@divyenp-a02 vsphere-csi-driver % kubectl describe pv pvc-71c27d29-2ab0-4850-9733-fda3b9236625
Name:            pvc-71c27d29-2ab0-4850-9733-fda3b9236625
Labels:          <none>
Annotations:     kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/migrated-to: csi.vsphere.vmware.com
                 pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    vcp-sc
Status:          Bound
Claim:           default/vcp-pvc
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        2Gi
Node Affinity:   <none>
Message:         
Source:
    Type:               vSphereVolume (a Persistent Disk resource in vSphere)
    VolumePath:         [vsanDatastore] 949d9662-8812-1c79-7565-0200ad8e5983/kubernetes-dynamic-pvc-71c27d29-2ab0-4850-9733-fda3b9236625.vmdk
    FSType:             ext4
    StoragePolicyName:  
Events:                 <none>

Volume is migrated to CSI and registered as FCD

% kubectl get cnsvspherevolumemigrations 403c3770-397e-4b10-bc4d-76deb9ad93e8 -o yaml
apiVersion: cns.vmware.com/v1alpha1
kind: CnsVSphereVolumeMigration
metadata:
  creationTimestamp: "2022-05-31T23:35:49Z"
  generation: 1
  name: 403c3770-397e-4b10-bc4d-76deb9ad93e8
  resourceVersion: "1843838"
  uid: e1f28fd4-5a55-4234-ad46-f44d7d731d77
spec:
  protectvolumefromvmdelete: false
  volumeid: 403c3770-397e-4b10-bc4d-76deb9ad93e8
  volumepath: '[vsanDatastore] 949d9662-8812-1c79-7565-0200ad8e5983/kubernetes-dynamic-pvc-71c27d29-2ab0-4850-9733-fda3b9236625.vmdk'

Created Pod using this volume

# kubectl create -f vcp-pod.yaml 
pod/pod created

Control flag keepAfterDeleteVm is set on the migrated volume

{"level":"info","time":"2022-06-01T19:45:03.413237987Z","caller":"vanilla/controller.go:997","msg":"ControllerPublishVolume: called with args {VolumeId:[vsanDatastore] 949d9662-8812-1c79-7565-0200ad8e5983/kubernetes-dynamic-pvc-71c27d29-2ab0-4850-9733-fda3b9236625.vmdk NodeId:420234a6-6215-39e5-4a89-d7bd17671b9e VolumeCapability:mount:<fs_type:\"ext4\" > access_mode:<mode:SINGLE_NODE_WRITER >  Readonly:false Secrets:map[] VolumeContext:map[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:03.429494015Z","caller":"migration/migration.go:254","msg":"Set FCD_KEEP_AFTER_DELETE_VM control flag using Vslm APIs","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:05.07385122Z","caller":"volume/manager.go:2286","msg":"Successfully set FCD_KEEP_AFTER_DELETE_VM control flag for volumeID: \"403c3770-397e-4b10-bc4d-76deb9ad93e8\"","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:05.074630732Z","caller":"migration/migration.go:263","msg":"Migrated Volume with ID: \"403c3770-397e-4b10-bc4d-76deb9ad93e8\" is protected from VM deletion","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:08.06504874Z","caller":"volume/manager.go:635","msg":"AttachVolume: volumeID: \"403c3770-397e-4b10-bc4d-76deb9ad93e8\", vm: \"VirtualMachine:vm-51 [VirtualCenterHost: 10.187.120.232, UUID: 420234a6-6215-39e5-4a89-d7bd17671b9e, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: 10.187.120.232]]\", opId: \"1930d337\"","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:08.065293535Z","caller":"volume/manager.go:670","msg":"AttachVolume: Volume attached successfully. volumeID: \"403c3770-397e-4b10-bc4d-76deb9ad93e8\", opId: \"1930d337\", vm: \"VirtualMachine:vm-51 [VirtualCenterHost: 10.187.120.232, UUID: 420234a6-6215-39e5-4a89-d7bd17671b9e, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-3, VirtualCenterHost: 10.187.120.232]]\", diskUUID: \"6000C295-2e19-e082-83c7-2c1c8b2179f5\"","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:08.065454458Z","caller":"vanilla/controller.go:1101","msg":"ControllerPublishVolume successful with publish context: map[diskUUID:6000c2952e19e08283c72c1c8b2179f5 type:vSphere CNS Block Volume]","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}
{"level":"info","time":"2022-06-01T19:45:08.065481361Z","caller":"vanilla/controller.go:1112","msg":"Volume \"403c3770-397e-4b10-bc4d-76deb9ad93e8\" attached successfully to node \"420234a6-6215-39e5-4a89-d7bd17671b9e\".","TraceId":"17f8a582-f3f1-4fb7-8d23-3e7f5b818a79"}

Migration CRD is updated and protectvolumefromvmdelete is set to true.

# kubectl get cnsvspherevolumemigrations -o yaml
apiVersion: v1
items:
- apiVersion: cns.vmware.com/v1alpha1
  kind: CnsVSphereVolumeMigration
  metadata:
    creationTimestamp: "2022-06-01T21:24:20Z"
    generation: 2
    name: 403c3770-397e-4b10-bc4d-76deb9ad93e8
    resourceVersion: "2125204"
    uid: a4cbc337-6425-4b1c-b4ab-23937ded716b
  spec:
    protectvolumefromvmdelete: true
    volumeid: 403c3770-397e-4b10-bc4d-76deb9ad93e8
    volumepath: '[vsanDatastore] 949d9662-8812-1c79-7565-0200ad8e5983/kubernetes-dynamic-pvc-71c27d29-2ab0-4850-9733-fda3b9236625.vmdk'
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Deleted Pod and re-created Pod.

root@k8s-control-227-1653523650:~# kubectl delete pod pod
pod "pod" deleted


root@k8s-control-227-1653523650:~# kubectl create -f vcp-pod.yaml 
pod/pod created


# kubectl get pods
NAME   READY   STATUS    RESTARTS   AGE
pod    1/1     Running   0          66s

Observed already protected log

{"level":"info","time":"2022-06-01T19:46:40.979946722Z","caller":"migration/migration.go:260","msg":"migrated volume with ID: "403c3770-397e-4b10-bc4d-76deb9ad93e8" is already protected from vm deletion","TraceId":"0033f128-f3c8-489b-a06d-37307c0b54c7"}

Confirmed flag is properly set using mob

pr-1762-keep-after-delete-vm

Special notes for your reviewer:

Release note:

protect migrated volume from node vm deletion

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 31, 2022
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 31, 2022
@divyenpatel divyenpatel force-pushed the protect-migrated-volume-from-vm-deletion branch 2 times, most recently from 7446f5e to b510a83 Compare May 31, 2022 20:12
@divyenpatel divyenpatel changed the title protect migrated volume from node vm deletion WIP - protect migrated volume from node vm deletion Jun 1, 2022
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 1, 2022
@divyenpatel divyenpatel changed the title WIP - protect migrated volume from node vm deletion protect migrated volume from node vm deletion Jun 1, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 1, 2022
@divyenpatel divyenpatel force-pushed the protect-migrated-volume-from-vm-deletion branch from 5813059 to ff8df3d Compare June 1, 2022 21:30
@divyenpatel divyenpatel force-pushed the protect-migrated-volume-from-vm-deletion branch 2 times, most recently from 9f44694 to fd1c431 Compare June 1, 2022 22:11
Copy link
Collaborator

@chethanv28 chethanv28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me.
As a follow-up we need a released version of govmomi & unit tests, which will be added later.
/approve

@divyenpatel divyenpatel force-pushed the protect-migrated-volume-from-vm-deletion branch 2 times, most recently from bea12da to 7a6fa60 Compare June 2, 2022 17:01
@divyenpatel
Copy link
Member Author

@chethanv28 I have addressed all your review comments.

@divyenpatel
Copy link
Member Author

Changes look good to me. As a follow-up we need a released version of govmomi & unit tests, which will be added later. /approve

@dougm has helped create new govmomi release required for this PR.
I am bumping up govmomi to - https://github.com/vmware/govmomi/commits/v0.27.5 in this PR.

@divyenpatel divyenpatel force-pushed the protect-migrated-volume-from-vm-deletion branch from 7a6fa60 to 98b6fea Compare June 2, 2022 18:15
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 2, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28, divyenpatel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [chethanv28,divyenpatel]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@chethanv28
Copy link
Collaborator

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 2, 2022
@k8s-ci-robot k8s-ci-robot merged commit d588ca5 into kubernetes-sigs:master Jun 2, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 2, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 2, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 2, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 2, 2022
@divyenpatel divyenpatel added this to the vSphere CSI Migration milestone Jun 3, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 3, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 3, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 3, 2022
divyenpatel added a commit to divyenpatel/vsphere-csi-driver that referenced this pull request Jun 3, 2022
k8s-ci-robot pushed a commit that referenced this pull request Jun 3, 2022
…1762) (#1767)

* protect migrated volume from vm deletion (#1762)

* fixes to run prow tests
k8s-ci-robot pushed a commit that referenced this pull request Jun 8, 2022
…1762) (#1768)

* protect migrated volume from vm deletion (#1762)

* fixes to run prow tests
k8s-ci-robot pushed a commit that referenced this pull request Jun 8, 2022
…1762) (#1765)

* protect migrated volume from vm deletion (#1762)

* fixes to run prow tests
return err
}
// Set up the VC connection
err = m.virtualCenter.ConnectVslm(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What endpoint is this code connecting to? Is this connecting to the Pandora(sps) endpoint? Pandora is removed in vSphere 8.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants